22 research outputs found

    Attention-Based End-to-End Speech Recognition on Voice Search

    Recently, there has been growing interest in end-to-end speech recognition that directly transcribes speech to text without any predefined alignments. In this paper, we explore the use of an attention-based encoder-decoder model for Mandarin speech recognition on a voice search task. Previous attempts have shown that applying an attention-based encoder-decoder to Mandarin speech recognition is quite difficult due to the logographic orthography of Mandarin, the large vocabulary and the conditional dependency of the attention model. We use character embedding to deal with the large vocabulary. Several tricks are used for effective model training, including L2 regularization, Gaussian weight noise and frame skipping. We compare two attention mechanisms and use attention smoothing to cover long context in the attention model. Taken together, these tricks allow us to achieve a character error rate (CER) of 3.58% and a sentence error rate (SER) of 7.43% on the MiTV voice search dataset. Combined with a trigram language model, CER and SER reach 2.81% and 5.77%, respectively.

    A meta learning scheme for fast accent domain expansion in Mandarin speech recognition

    Spoken language varies significantly between Mandarin and its accents. Despite the high performance of Mandarin automatic speech recognition (ASR), accented ASR is still a challenging task. In this paper, we introduce meta-learning techniques for fast accent domain expansion in Mandarin speech recognition, which extends coverage to new accents without degrading the performance of Mandarin ASR. Meta-learning, or learning to learn, can learn general relations across multiple domains rather than over-fitting to a specific domain, so we adopt it for the domain expansion task. This more fundamental learning improves performance on accent domain expansion tasks. We combine meta-learning with freezing of model parameters, which makes recognition performance more stable across different cases and speeds up training by about 20%. Our approach outperforms other methods by about 3% relative on the accent domain expansion task. Compared to the baseline model, it achieves a 37% relative improvement while performance on the Mandarin test set remains unchanged. In addition, the method proved effective on a large amount of data, with a relative performance improvement of 4% on the accent test set.

    Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition

    Recently, Conformer as a backbone network for end-to-end automatic speech recognition has achieved state-of-the-art performance. The Conformer block leverages a self-attention mechanism to capture global information, along with a convolutional neural network to capture local information, resulting in improved performance. However, the Conformer-based model encounters an issue with the self-attention mechanism, as computational complexity grows quadratically with the length of the input sequence. Inspired by previous Connectionist Temporal Classification (CTC) guided blank skipping during decoding, we introduce intermediate CTC outputs as guidance into the downsampling procedure of the Conformer encoder. We define a frame with non-blank output as a key frame. Specifically, we introduce the key frame-based self-attention (KFSA) mechanism, a novel method to reduce the computation of the self-attention mechanism using key frames. The structure of our proposed approach comprises two encoders. Following the initial encoder, we introduce an intermediate CTC loss function to compute frame labels, enabling us to extract the key frames and blank frames for KFSA. Furthermore, we introduce the key frame-based downsampling (KFDS) mechanism to operate on high-dimensional acoustic features directly and drop the frames corresponding to blank labels, which results in new acoustic feature sequences as input to the second encoder. The proposed method achieves comparable or higher performance than vanilla Conformer and other similar work such as Efficient Conformer. Meanwhile, it can discard more than 60% of useless frames during model training and inference, which significantly accelerates inference. Code for this work is available at https://github.com/scufan1990/Key-Frame-Mechanism-For-Efficient-Conformer. Comment: This manuscript has been accepted by IEEE Signal Processing Letters for publication.
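
    The frame-dropping idea in the abstract above can be illustrated with a minimal sketch. It assumes greedy (argmax) CTC labels and a blank index of 0; the function name `key_frame_downsample` and the toy data are hypothetical and are not taken from the authors' repository.

```python
from typing import List, Sequence, Tuple

BLANK_ID = 0  # assumption: index 0 is the CTC blank label


def key_frame_downsample(
    features: Sequence[Sequence[float]],
    ctc_logits: Sequence[Sequence[float]],
    blank_id: int = BLANK_ID,
) -> Tuple[List[List[float]], List[int]]:
    """Keep only the frames whose greedy CTC label is non-blank (key frames)."""
    if len(features) != len(ctc_logits):
        raise ValueError("features and ctc_logits must have the same number of frames")
    kept_features: List[List[float]] = []
    kept_indices: List[int] = []
    for t, (frame, logits) in enumerate(zip(features, ctc_logits)):
        # Greedy decision: the label with the highest score at this frame.
        label = max(range(len(logits)), key=lambda k: logits[k])
        if label != blank_id:
            kept_features.append(list(frame))
            kept_indices.append(t)
    return kept_features, kept_indices


# Toy example: 4 frames, 2-dimensional features, 3 output labels (0 = blank).
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
ctc_logits = [
    [2.0, 0.1, 0.3],  # blank frame
    [0.2, 1.5, 0.1],  # label 1, key frame
    [3.0, 0.2, 0.1],  # blank frame
    [0.1, 0.2, 2.2],  # label 2, key frame
]
key_frames, key_indices = key_frame_downsample(features, ctc_logits)
print(key_indices)  # [1, 3]
```

    In this toy case two of the four frames are dropped, so a second encoder would see a sequence half as long.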

    Comparison Study of Wide Bandgap Polymer (PBDB-T) and Narrow Bandgap Polymer (PBDTTT-EFT) as Donor for Perylene Diimide Based Polymer Solar Cells

    Perylene diimide (PDI) derivatives, a promising class of non-fullerene acceptors (NFAs), have developed rapidly. However, most of the relevant work has focused on synthesizing novel PDI-based structures, and little attention has been paid to the selection of the polymer donor in PDI-based solar cells. The wide bandgap polymer PBDB-T and the narrow bandgap polymer PBDTTT-EFT are known as the most efficient polymer donors in polymer solar cells (PSCs). While PBDB-T is favored with non-fullerene acceptors, achieving power conversion efficiency (PCE) of more than 12%, PBDTTT-EFT is one of the best electron donors with fullerene acceptors, with PCE up to 10%. Despite their different absorption profiles, the working principles of these benchmark polymer donors with the same electron acceptor, especially PDI-based acceptors, have rarely been compared. To this end, we used PBDB-T and PBDTTT-EFT as the electron donors, and 1,1′-bis(2-methoxyethoxyl)-7,7′-(2,5-thienyl) bis-PDI (Bis-PDI-T-EG) as the electron acceptor to fabricate PSCs, and systematically compared their differences in device performance, carrier mobility, recombination mechanism, and film morphology.

    Benefits and risks of the hormetic effects of dietary isothiocyanates on cancer prevention

    The isothiocyanate (ITC) sulforaphane (SFN) was shown at low levels (1-5 µM) to promote cell proliferation to 120-143% of the controls in a number of human cell lines, whilst at high levels (10-40 µM) it inhibited such cell proliferation. Similar dose responses were observed for cell migration, i.e. SFN at 2.5 µM increased cell migration in bladder cancer T24 cells to 128% whilst high levels inhibited cell migration. This hormetic action was also found in an angiogenesis assay where SFN at 2.5 µM promoted endothelial tube formation (118% of the control), whereas at 10-20 µM it caused significant inhibition. The precise mechanism by which SFN influences promotion of cell growth and migration is not known, but probably involves activation of autophagy since an autophagy inhibitor, 3-methyladenine, abolished the effect of SFN on cell migration. Moreover, low doses of SFN offered a protective effect against free-radical mediated cell death, an effect that was enhanced by co-treatment with selenium. These results suggest that SFN may either prevent or promote tumour cell growth depending on the dose and the nature of the target cells. In normal cells, the promotion of cell growth may be of benefit, but in transformed or cancer cells it may be an undesirable risk factor. In summary, ITCs have a biphasic effect on cell growth and migration. The benefits and risks of ITCs are not only determined by the doses, but are affected by interactions with Se and the measured endpoint.

    Study on Damage Characteristics of Water-Bearing Coal Samples under Cyclic Loading–Unloading

    For underground water reservoirs in coal mines, the complex water-rich environment and changing overburden stress can damage coal pillar dams. In this paper, coal samples from coal seam 22 of the Shangwan coal mine were taken as research objects, and the damage mechanism and characteristics of coal samples with different moisture contents and numbers of wetting-drying cycles under cyclic loading were investigated. The results show that as the moisture content and the number of wetting-drying cycles increase, the post-peak stage of the coal samples under cyclic stress becomes obvious, and the hysteresis loop changes from dense to sparse. Compared to the uniaxial compression experiment, when w = 5.28% (the critical water content), mechanical parameters such as peak strength and modulus of elasticity decrease the most. Under cyclic loading, the damage mode of both sets of coal samples was tensile damage, but an increase in wetting-drying cycles promotes the development of shear fractures. For evaluating fracture types, the RA-AF density map is more applicable to wetting-drying cycle coal samples, whereas for coal samples with different moisture contents it should be applied with caution. This study can provide some theoretical basis for the stability evaluation of coal pillar dams in underground water reservoirs.

    Numerical Investigation on the Yield Pillar Bearing Capacity under the Two-End-Type Cable Reinforcement

    For underground coal mining techniques such as gob-side entry retaining (GER) or gob-side entry driving (GED), the stability of yield pillars is paramount. A well-designed yield pillar aims to withstand mining-induced stresses. This study delves into the impact of bi-terminal cable support on the stability of such pillars. Utilizing 30 distinct numerical models, each with varying pillar width/height (w/h) ratios and diverse cable support methodologies, our findings suggest an upward trend in both peak and residual strength in response to heightened support strength. Notably, pillars with a wider configuration exhibited a more pronounced increase in peak strength compared to their narrower counterparts, while the latter showcased a more pronounced residual strength enhancement. Additionally, the residual/peak strength ratio was smaller in narrower pillars and increased with the increase in the cable support strength. In view of the surrounding rock mass’s support stress distribution, numerical modelling was adopted to analyze the underlying support mechanism. The results showed the support stress zones extended farther on both sides of pillars with the decrease in the row spacing, which made the radial stresses rise effectively and ameliorated the coal pillar’s stress state. Finally, with the 8311 operation advancing towards the station, the deformation amplitude of the coal pillar was only 2.28%, and the stability of the coal pillar was effectively maintained.